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When using incorrect or inaccurate signal models to perform parameter estimation on a gravita¬ 
tional wave signal, biased parameter estimates will in general be obtained. For a single event this 
bias may be consistent with the posterior, but when considering a population of events this bias 
becomes evident as a sag below the expected diagonal line of the P-P plot showing the fraction of 
signals found within a certain significance level versus that significance level. It would be hoped 
that recently proposed techniques for accounting for model uncertainties in parameter estimation 
would, to some extent, alleviate this problem. Here we demonstrate that this is indeed the case. 
We derive an analytic approximation to the P-P plot obtained when using an incorrect signal model 
to perform parameter estimation. This approximation is valid in the limit of high signal-to-noise 
ratio and nearly correct waveform models. We show how the P-P plot changes if a Gaussian process 
likelihood that allows for model errors is used to analyse the data. We demonstrate analytically and 
using numerical simulations that the bias is always reduced in this way. These results provide a way 
to quantify bias in inference on populations and demonstrate the importance of utilising methods 
to mitigate this bias. 


I. INTRODUCTION 

In the coming years it is expected that the advanced 
era ground-based gravitational wave (GW) detectors that 
are now coming online (such as advanced LIGO [T] and 
advanced Virgo m) will begin to make routine measure¬ 
ments of GWs from a variety of sources. Later in this 
decade, pulsar timing arrays could also begin to detect 
sources of nanohertz gravitational waves [MS] and there 
are ambitious plans for a space-based gravitational wave 
detector (eLISA [7]) operating in the millihertz band, 
that with be launched by ESA around 2034. Inferences 
about source parameters in this new era of GW astron¬ 
omy will rely on the availability of detailed signal mod¬ 
els for the sources. The calculation of accurate mod¬ 
els is computationally prohibitive, however, so approx¬ 
imate models will be used for inference, which will, in 
general, lead to biases in the parameter estimates ob¬ 
tained. This can lead one to make incorrect inferences 
about individual sources as well as incorrect inferences 
about astronomical populations of sources. The bias due 
to incorrect models becomes more important for louder 
sources. eLISA is expected to observe the inspiral and 
merger of supermassive black holes at signal-to-noise ra¬ 
tios of 0(10^). The impact of parameter bias has been 
shown to be even more significant in this case |8]. 

A common way to quantify the performance of param¬ 
eter estimation is via the probability-probability (P-P) 
plot. The P-P plot shows the probability that the true 
source parameters will lie in a given confidence inter¬ 
val estimated from the detector data, against the value 
of the confidence interval. In the ideal, unbiased, case 
the P-P should be a diagonal line; i.e., x% of the time 
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the true source parameters should lie with the x% con¬ 
fidence interval. However, there are a variety of ef¬ 
fects that can cause the P-P plot to deviate from this 
ideal. For example, use of a greedy algorithm to build 
a multi-dimensional confidence interval from a kD-tree 
constructed from a random sample of points from a dis¬ 
tribution (this problem was discussed in the context of 
sky-localisation by 0 ), deviations between the waveform 
model and the true signal due to a breakdown of gen¬ 
eral relativity (GR) in the strong field (the case of un¬ 
detectable deviations from GR, the so-called “stealth- 
bias”, was considered in cni), and mis-estimating the 
noise properties of the detector can all cause the P-P plot 
to deviate from a ideal diagonal line. However, the cause 
of biased parameter estimation that we will consider in 
this paper is the presence of inaccuracies in the waveform 
model used to analyse the data jSj. If such a systematic 
error is present the returned confidence intervals from a 
parameter estimation study will be shifted away from the 
true parameters making it less likely that the confidence 
interval contains the true parameters. Therefore the P-P 
plot will “sag” below the ideal diagonal line. 

Recently m the authors proposed a marginalised like¬ 
lihood which uses Gaussian processes (GPs) to fold in 
extra information from a small training set of accurate 
waveforms, e.g. numerical relativity (NR) waveforms. 
Accurate here refers to how well these waveforms rep¬ 
resent solutions of the GR field equations. Numerical 
relativity waveforms are not perfectly accurate, but they 
are the best solutions currently available and inaccuracies 
in them can be folded into the GP analysis. If astrophys- 
ical gravitational waves are governed by a theory other 
than general relativity, these waveforms will not be ac¬ 
curate representations of reality. This will also lead to a 
bias, but one that is harder to quantify without knowing 
the true theory of gravity. Here we proceed assuming GR 
is correct and look only at biases from model uncertain¬ 
ties. Once observations are made this assumption could 
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be revisited if evidence arises for departures from GR. 

The GP marginalised likelihood in general shifts the 
best fit parameters closer to the true parameters and 
broadens the peak in the posterior, making it more likely 
that a given confidence contour contains the true param¬ 
eters. Therefore, it would be expected that parameter es¬ 
timates obtained using the marginalised likelihood would 
exhibit less of a bias, and the P-P plots would exhibit 
less of a “sag”. However, the Gaussian process regression 
(GPR) which underlies the marginalised likelihood makes 
some assumptions about how the error in the waveform 
model varies over parameter space. In this paper, we 
investigate the P-P plots both in the case where these 
assumptions turn out to be correct, and, more impor¬ 
tantly, when they are incorrect. 

There are two main results in this paper. The first is a 
derivation of an analytic expression for the expected sag 
in a P-P plot arising from waveform uncertainties. This is 
derived under the assumption that the waveform error is 
small so that we can use the linear signal approximation. 
The second is that the use of the marginalised likelihood 
constructed via Gaussian process regression to analyse 
data leads to a reduction in the size of the deviation 
from the diagonal line. The sag is removed completely if 
the true waveform errors are drawn from the same model 
used to construct the marginalised likelihood. However, 
even when the errors follow a different distribution, the 
marginalised likelihood leads to a reduction in the sag. 

This paper is organised as follows. Sec. |H] provides a 
recap of and quotes some necessary results about GW 
parameter estimation, and introduces the marginalised 
likelihood. Sec. HI derives analytic expressions for the 
P-P plots for both the standard and marginalised likeli¬ 
hoods for a variety of possible waveform errors. Sec. |IV| 
describes the numerical simulations that were performed 
to back-up the analytic results in Sec. |HI| Finally Sec. 
[V| contains a discussion of the results and concluding re¬ 
marks. 


II. PARAMETER ESTIMATION 


We assume that the source of GWs is fully specified 
by a parameter vector A, and that the true waveform 
model is /i(t; A) (hereafter the dependence of h on time t is 
supressed for clarity). The aim of a parameter estimation 
study given measured data s, is to estimate the posterior 
probability on the parameters, P(A|s). This is given from 
Bayes theorem (Eq.[^ by the likelihood, P(s|A) = T'(A), 
the prior, P(A), and the normalising Bayesian evidence 
Z = /dAP(A)L'(A); 
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As this paper concerns parameter estimation, and not 
model selection, we will not discuss the evidence further. 


since for any given source, this just enters as a normali¬ 
sation factor for the posterior. In the case of stationary, 
Gaussian, additive noise n in the detector the measured 
data is given by s = /i(Ao) +n and the likelihood is given 

by 

L'(A) oc exp (s - h{X)\s - /i(A)^^ , (2) 

Where (j-) denotes the usual noise-weighted inner prod¬ 
uct 


In Eq. Sn{f) is the (two-sided) noise power spectral 
density in the detector. 

In general we do not have access to the true wave¬ 
form model /i(A), at least not at a reasonable computa¬ 
tional cost. Highly, but not totally, accurate NR wave¬ 
forms have recently started to become available [12] , and 
slightly less accurate (but computationaly cheaper) ex¬ 
tended analytic models such as (S)EOBNR [T3j are also 
available. However, these are too computationally expen¬ 
sive to use in routine parameter estimation studies, which 
typically require many thousands of likelihood evalua¬ 
tions. Instead, we must make use of cheaper but less 
accurate waveforms, such as post-Newtonian (PN) [14] 1. 
or numerical “kludge” models m- Denoting the approx¬ 
imate waveform model by H(A), the approximate likeli¬ 
hood obtained when using this model is given by 

L(A) oc exp (|s - FI(A)|s - H(A)^^ . (4) 

In general, posterior distributions obtained from this like¬ 
lihood will not agree with posterior distributions ob¬ 
tained from the exact likelihood in Eq. ([^. Denote by 
Aexact the best fit parameters obtained from Eq. (|^ and 
Aapprox the best fit parameters obtained from Eq. Q. 
If both the waveform difference and the parameter shift 
AA = Aapprox ~ Aexact ar® Small quantities, 0(e), then 
an approximate expression for the shift in the parame¬ 
ters can be found by expanding in e. The shift in best-fit 
parameters to linear order in e was obtained in |8] as 
AA = AAi where 


AA“ = - (E 


-1 


ah 


(<5h(Ao)|5fcH(Ao)), 


( 5 ) 


Eaf, = {daH{X)\dbH{\)^, and da = d/dX‘^\x^Xa- For 
completeness we include a derivation of this result, and 
an extension of it to quadratic order, in Appendix [A| . 

From Eqs. ([^ and (A6) it can be seen that the sys¬ 
tematic shift in parameters caused by using the approxi¬ 
mate likelihood is independent of the signal-to-noise ratio 
(SNR). This fact was observed in [8|, and since the statis¬ 
tical errors that arise from detector noise decrease with 
increasing SNR this means that the systematic shift is 
most important for the loudest sources. 






3 


When using the approximate likelihood in Eq. Q to 
characterise a single source one would usually use the 
condition that the systematic error due to the model un¬ 
certainty is less than the random error arising from noise 
to determine if the model is “good enough”. This condi¬ 
tion ensures that the true parameters will be consistent 
with the posterior — the amount by which the system¬ 
atic error shifts the peak of the posterior is less than the 
typical posterior width. However, whilst this condition 
ensures that the true parameters will always be consis¬ 
tent with the posterior, on average they will be further 
from the centre of the posterior and hence lie at a lower 
significance than they should. This starts to become im¬ 
portant when observing a population of sources (as we 
hope will be the case for Advanced LIGO). Even small 
systematic shifts may lead one to make incorrect infer¬ 
ences about the properties of the population. This can 
be understood by imagining that we observe a NS-NS 
binary with identical astrophysical parameters n inde¬ 
pendent times with Advanced LIGO. The error in the 
combined estimate for the mean mass of the population 
is the error in each measurement divided by y/n. There¬ 
fore even if the systematic model error is insignificant for 
making inferences regarding a single binary it becomes in¬ 
creasingly significant for inferences regarding populations 
as new sources are added. The importance of the model 
errors for LIGO observations of NS-NS binaries was con¬ 
sidered by m- Model error effects could also be seen in 
the parameter estimation analysis of the “big-dog” blind 
injection. In that case, the recovered masses for the com¬ 
pact binary injection were significantly biased (in part) 
by the fact that different signal models were used for 
the injection and parameter estimation [17] . This indi¬ 
cates the importance of considering how to incorporate 
model uncertainties in parameter estimation before the 
advanced detector era begins. A detailed investigation 
of parameter estimation on various injections into data 
from the LIGO/Virgo interferometers and employing a 
range of different models for the analysis was carried out 
in [18] . These results clearly show how the analysis of the 
same data using two different models can give mutually 
inconsistent results. 

The recently proposed marginalised likelihood m) 
attempted to account for the systematic error in the pos¬ 
terior, and hence remove the bias. The approximate like¬ 
lihood is constructed by including information from a 
small training set of accurate waveforms computed of¬ 
fline; 


I?={(A,5/i(A))|i = l,2,...,n} , (6) 

in which 6h{X) = H{X) — h{X) denotes the difference 
between the approximate waveform and the true wave¬ 
form. GPR assumes that the waveform differences in the 
training set are a realisation of a Gaussian process with 
covariance function k{X, X') over the parameter space A. 
Different covariance functions may be considered and the 
evidence for the Gaussian process can be maximised with 


respect to variations in the parameters of the covariance 
function: this process of optimising the covariance func¬ 
tion is called “training”, and it enables the Gaussian pro¬ 
cess to “learn” the properties of the waveform differences 
in V. The Gaussian process, once trained, may then be 
used to interpolate the waveform difference across param¬ 
eter space. As we are not interested in the actual wave¬ 
form difference, but rather in its effect on the posterior, 
the GPR interpolation is used as a prior to analytically 
marginalise over the unknown waveform difference. The 
resulting expression for the marginalised likelihood is m 


£(A) oc 





( 7 ) 


where the GPR quantity /r(A) is the mean waveform dif¬ 
ference and cr^(A) is the error in this GPR estimate; 

fi{X) = fc(Ai, A) inv (jt{Xt,Xj)j Sh{Xj) , (8) 

(t^(A) = fc(A, A) — k{Xi, A)inv (ji{Xi, Xj)j k{Xj, A). (9) 

Eor more details on the technique of Gaussian process 
regression see (for example) jTnU^S] and for more details 
of the marginalised likelihood see m- 


III. ANALYTIC CALCULATION OF THE P-P 
PLOT 


In the limit of high SNR the posterior probability dis¬ 
tribution obtained in the analysis of data from a detector 
will be strongly peaked in the vicinity of the true param¬ 
eters. Within the vicinity of this peak it is reasonable to 
expand both the exact and approximate signal models in 
the usual linear signal approximation (LSA), i.e. 

h{X) = h{Xo) + AA“a,/i(Ao), 

H{X) = H{Xo) + AX^^daHiXo) . (10) 


where Aq denotes the parameter values of the true sig¬ 
nal, A denotes the parameter values at which we want to 
evaluate the signal or likelihood and A A = A — Aq. This 
LSA is the usual approximation made in the derivation 
of the Eisher Matrix and the approximation used in the 
derivation of Eqs. ([^ and (A6). 

We are interested in predicting the “sag” that would 
be expected in a P-P plot. If we use an approximate 
waveform model to compute the posterior, then we would 
expect some bias in the recovered parameters and a sag 
in the P-P plot - on average the true parameters would 
be further away from the peak of the posterior than we 
would expect, and so fewer injections would be recovered 
at a given significance level. 
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A. The exact likelihood 

The exact likelihood, by definition, will give a diagonal 
unbiased P-P plot. However we will re-derive this obvious 
result to shed light on the calculations that follow. 

The Exact Likelihood is given by Eq. The mea¬ 
sured data is assumed to consist of a signal with true pa¬ 
rameters Ao and additive Gaussian noise; s = /i(Ao) -I- n. 

In the limit of high SNR, the difference between two 
nearby signals in parameter space may be expanded us¬ 
ing the LSA, 

L'{X) (xeycp(^^(n —AX^dah^n —A\°‘dah'^ , (11) 

= exp [(n|n) - 2AA“ {n\dah) + AA“AA^S'ah]^ , 

where the exact Fisher matrix is Sab = {dah\dbh). Since 
the Fisher matrix is symmetric by construction, we may 
adopt new coordinates in parameter space AA = Q'^AX’ 
such that the Fisher matrix in these coordinates becomes 
diagonal. Sab = QaQl^pq- This amounts to rescaling 
the coordinate axes such that the iso-probability contour, 
which originally was an n-ellipsoid, becomes an n-sphere. 
Derivatives with respect to the new coordinates will be 
denoted with a tilde, dah = Qadbh. In these new coordi¬ 
nates the likelihood separates to become 


limit when the posterior is narrow. The quantity is 
given by 

='^ (n\da:h'^ = (n\dah) (n\dbh) , (15) 


and is distributed as a random variable with 
N = dim(A) degrees of freedom. The inverse regu¬ 
larised incomplete gamma function is defined via y = 
r(a:, r“^(a;, y)). The quantity on the ordinate axis of a a 
standard P-P plot is the probability that the true param¬ 
eters lie within a given significance, P’(sig < X). From 
Eq. (15) it may be seen that this can be rewritten as a 
cumulative probability of the random variable R^; 


P(sig < A) = 1 - P ( < 2r-^ ( ^, 1 - A 


(16) 


The cumulative distribution function of the distribu¬ 
tion is the regularised Gamma function, P(R? < y) = 
r{N/2,y/2). Using this to evaluate Eq. 16 gives the ex¬ 


pected, unbiased diagonal form of the P-P plot for the 
exact likelihood; 


P(sig < A) = 1 - (1 - A) = A. 


(17) 


L'(A) cx 


n“p(-)( 


AA - 


\da-h 


}) 


( 12 ) 


In order to exploit the spherical symmetry about 
the peak in the rescaled parameters we adopt 
(n-dimensional) spherical coordinates centred 
on the peak; the radial coordinate given by 
— < n\dxh >)'^. The significance 
of the true parameters is given by the volume of the 
posterior that is “closer to the peak”, i.e., that has 
higher posterior weight than the true parameters. 


sig = 


dr ^ exp(—r^/2) 
dr exp(—r^/2) 


= 1 - 


r(f.f) 


= i-rfX^ 

2 2 


where r(x, j/) is the incomplete Gamma function. 


r(a 


,2/)= r 

J V 


p-ie-‘dt, 


(13) 


(14) 


r(x) = r(a:, 0) is the complete Gamma function, and 
r(x, y) is the regularised incomplete gamma function de¬ 
fined via the last equality in Eq. (131. In Eq. (13) the 


assumption has been made that the prior distribution on 
the parameters may be approximated as a constant across 
the width of the peak; this is reasonable in the high SNR 


This diagonal P-P plot is shown in the dotted black curve 
in the left-hand panel of Fig. The fact that the PP 
plot for the exact likelihood is always diagonal follows 
from the definition of the likelihood, and this remains 
true even if the LSA fails. The derivation just presented 
assumes the LSA in order to make it resemble as closely 
as possible the upcoming derivation for the approximate 
likelihood. 


B. The approximate likelihood 


We now move on to the more interesting case when 
we have biased parameter estimation from using the ap¬ 
proximate likelihood. As mentioned in the introduction 
we expect to obtain a P-P plot that is “sagging” below 
the diagonal indicating the bias. We first treat the simple 
case where the waveform model depends on just a single 
parameter, X = 9, where the expression for the P-P plot 
is given in terms of the inverse error function, erf~^(a;). 
A treatment will then be given for the general N dimen¬ 
sional case in which the expression for the P-P plot is 
given in terms of the MarcumQ function, Q]si{x, y), along 
with a illustration of how this reduces to the ID result. 

The Approximate Likelihood is given by Eq. (|^. We 
assume the approximate model is “nearly” correct and 
use the LSA to expand signals that are nearby in param¬ 
eter space. As before, denoting the waveform difference 
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by 6h{X) = H{X) — h{X), we have 


L(A) oc exp ( —- {n — SH^Xq) — AX°'daH\ ... 


= exp - 


2 L 


n — Sh{Xo) 


-2AA“ (n - Sh{Xo)\daH) + AA“AA^Safc 


(18) 

)■ 


where the ellipsis in the right hand entry in the inner 
product denotes a repeat of the left hand entry and the 
approximate Fisher matrix is T,ab = {daH\dbH). As be¬ 
fore coordinates which diagonalise the Fisher matrix Tiab 
may be adopted, which give the following separated ex¬ 
pression for the approximate likelihood, 


L(A) oc ]^exp (^AA"" - <^n-(5h(Ao)|4-ff)) ^ • 

(19) 


1. Example for one dimensional parameter space 


If the wavefor m de pends on only one unknown param¬ 
eter, A = A, Eq. (19) becomes 


m = 


1 




exp 


where 


— - / — I 
cr^ \ dA 


A—Ao 




dff , 

dA Ia=Ao 


in — 6h{Xo) 


diJ, 

^Ia=Ao 


( 20 ) 

( 21 ) 

( 22 ) 


and we have included the correct normalisation of the 
posterior. The true parameter value is at A0 = 0 and 
the points with larger posterior weight than the true pa¬ 
rameters lie in the range 0 < A9 < 2/i when /i > 0 or in 
the range 2/r < A9 < 0 when /r < 0. The signihcance at 
which the true parameters lie is therefore 



1 




exp 


where 


1 


{Ae-pf 


dA0 = erf 



erf(z) 



(24) 


is the usual error function. The quantity /i defined above 
depends on the particular realisation of the noise. We 
want to know the fraction of times, over many realisations 
of the noise, that the true parameters will lie within a 
certain signihcance contour. This is just 


P(sig <X)=P 


V2c 


< erf-^A) 


(25) 


The quantity /r/(-\/2cr) is distributed as a Gaussian with 
mean fl = a (A/i(Ao)|di?/dA) /v^ and variance 1/2 and 
so 

P(sig < A) = ^erf (erf^A) - jl) 

-l-^erf (erf“^(A) -|- /t) . (26) 

In the special case where the approximate waveform 
model and the exact waveform model are the same, we 
have /t = 0 and recover the expected unbiased result from 

Sec. lmAl 

P(sig < A) = A . (27) 


This derivation assumed that fl was constant, but in 
practice this will vary from signal to signal. If we de¬ 
note the probability distribution function for fl over the 
astrophysical population by /(/i), the generalisation of 
Eq. (26) can be seen straightforwardly to be 


P(sig <X)= J 


-erf (erf ^{X)-fi) 
-l-^erf (erf“^(A) -|- fi) 


fifi)dfi .(28) 


2. Parameter space of arbitrary dimension 


We will now generalise the expression for the P-P 
plot sag in a one-dimensional parameter space, given in 
Eq. (26), to arbitrary numbers of parameters. Identical 


manipulations to those performed on the exact likelihood 
yields the same expression for the signihcance obtained 
in Eq. (13), 


- fN 


(29) 


except this time is a random variable given by 
R^ =Y.(n-Sh{Xo)\d^Hy (30) 

X 

= - Sh{Xo)\daH) (^n - Sh{Xo)\dbH) . 

If Sh{X) is constant across parameter space, R^ is now a 
non-central random variable with N degrees of free¬ 
dom and non-centrality parameter 

A = yh{Xo)\daH) yh{Xo)\dbH) . (31) 

As before the expression for the P-P plot is given in terms 
of the CDF of the distribution of the random variable 
R^. The CDF of the non-central distribution is the 
Marcum-Q function, P{R? < y) = Qn^/ fj), 

P(sig <X) = l-p(^R^< 2f-i 1 - (32) 

= 1-Q« fVA,W2f-i (^^,1 -aU33) 
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This is an analytic approximation to the P-P plot in the 
LSA and in the case of a constant waveform difference 
over parameter space; this function is plotted as a dotted 
black line in Fig.[^ In this case the P-P plot always sags 
below the diagonal indicating biased parameter recovery. 

If Sh{X) is not constant over parameter space, the gen¬ 
eralisation of this result takes the same form as Eq. (281, 
but with the term in square brackets replaced by Eq. (33) 


and with /(/i) replaced by the corresponding probability 
distribution function for A. For example, in the case that 
Sh{X) is distributed at different times and at different 
points in parameter space as an uncorrelated, zero-mean 
Gaussian with variance in each component of (i.e. 
dh{Xo) ^ A/’(0, e^)) then the quantities {Sh{Xo)\daH) are 
distributed as N{Q, E) and we see that A is distributed 
as times a distribution with N degrees of freedom 
with probability distribution function 


/(A) = 


1 


I -- 

:A 2 e 2 


2fT{N/2)e^ 

Writing = 2f“^ l — x) we must evaluate 
P(sig < X) 

. _A 

A 2 ^e ^ 


(34) 


2fr(f)e^ 

1 / 


e-5G"+A)/„_^(VAa;)dx 


dA 


JV 

X 2 e 2 


( 2 e)f-ir(f) 7 o 

^oo 

/ y^e“2(1+*^ /jv_^(exj/)dy 

Jo ^ 


da; 


(2e)f-ir(f)(l + e2)Hf Jo 


a; 2 e = 


-JV -152 
y 2 e 2 V Ijv_^ 


xy dy 


2f-ir(f)(l + e2)f Jo 

1 


^/v 

a;^“^e“2(i+.2) 


da; 


2fr(f) Jo 


u2 ^e 2du 


2 ’ 1 + e2 


(35) 


The analytic expression for the P-P plot is therefore very 
similar to the case of the exact likelihood, but with the 
argument of the regularised Gamma function scaled ap¬ 
propriately to give the same result as in the final line 
of Eq. (351. This P-P plot also exhibits a sag below the 
diagonal; see the orange dotted curve in Fig.[T] 

The N dimensional result in Eq. (331 can be shown 

using 


to reduce to the 1 dimensional result in Eq. 
the standard properties of the Marcum-Q function. The 
Marcum-Q function is defined by the integral 


^) — 


roo 

/X\m-1 

/ * 

-j exp 

Jb 

\a/ 




+ a2) 


Ira-l{ax)dx 


= exp 


--{a^ + h^) 


00 ^ 


k—l — r 


in which In{x) is the modified Bessel function of the first 
kind. For A^ = 1, the Marcum-Q function, Eq. (36), can 
also be simplified 


pOO 

Qi(a,b) = y/a / \/xexp 

^ Jb 


= l--(erf 


exp 
b — a 


(a;^ -I- a^) 

2 

{x + a)^ 


2 

■ erf 


I_i/2{ax)dx 

(x — a)^ 

exp 
6 -|- a 


V2 


which follows from I_i{x) = cosh{x)/^/x. When 

IV = 1 the regularised Gamma function becomes 

r(l/2,p2/2) _ dt 


r(l/2) 


J^e-yVi dt 


L 


fl/V2 


dw 


dzj 

= 1 - erf(i?/v^). 


(38) 


Eq. (32) therefore becomes 

P(sig < A) = i [ erf ( erf 


-|-erf erf 



(39) 


as we can identify y = A/2, we recover Eq. (26) as ex¬ 
pected. 


dx 


(37) 


where the second line follows by a change of variable and 
a change in the order of integration, the fourth line fol¬ 
lows from the fact that Qm(a,0) = 1 and the final lines 
follow from another change of variable. We have also 
made use of the integral expression for the Marcum-Q 
function given below. This result can also be obtained 
directly by noticing that the random variable R^, which 
depends on both n and dh{X), is distributed as 1 -I- 
times a random variable with N degrees of freedom. 


C. The marginalised likelihood 

The Marginalised Likelihood is given by Eq. (^, as be¬ 
fore this may be expanded in the LSA. In the high SNR 
limit the posterior is narrow compared to the length scale 
over which the waveform changes. The waveform differ¬ 
ence changes over the same length scale as the waveform. 
The quantity cr^(A) also changes over this length scale. 
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as it is “learnt” by the GP in the procedure of max¬ 
imising the evidence. Therefore in the high SNR limit 


cr^(A) may be approximated as a constant. As before 
coordinates which diagonalise the Fisher matrix may be 
adopted, which give the following separated expression 
for the approximate likelihood, 


£(A) cx JJexp 


/ 


V 


1 (^AX^ - (n + /r(Ao) - Sh{Xo)\d^(H - 

2 1 + ^2 


(40) 


r 


The waveform differe nce is assumed to be a small quan¬ 
tity, therefore in Eq. (40) the derivative {H — n) may 
be replaced by dx{H), as the difference is the product of 
small quantities. Identical manipulations to those per¬ 
formed on the exact and approximate likelihoods give the 
same expression for the significance obtained in Eqs. (13) 


and (29) 


sig = l-f , (41) 


except this time the random variable E? is given by 


= 


1 


1 -I- 0-2 
1 


(n + - 5h{\o)\dxH^ , 


(E-i)“'’ (n + m(Ao) - Sh(Xo)ld,H) (n + /i(Ao) - Sh(Xo)ldi,ff) . 


(42) 

(43) 


The GPR technique assumes that the Sh{X) are dis¬ 
tributed as a Gaussian process across parameter space, 
with zero mean and a covariance estimated from a train¬ 
ing set and any prior knowledge. If this assumption is 
in fact true, and the covariance has been correctly esti¬ 
mated, then the quantity ^{Xq) — Sh{Xo) is distributed as 
a zero mean Gaussian with variance . In this case (per¬ 
haps unsurprisingly) the marginalised likelihood com¬ 
pletely fixes the sag. The new random variable is 
distributed as a random variable with N degrees of 
freedom and using the regularised Gamma function as 
the CDF of this distribution we recover the diagonal P-P 
plot; 

P(sig < X) = 1 - P ^ 2f-i f 1 - a)) ,(44) 


= X. 


(45) 


This case is shown, both analytically and numerically, in 
orange in Fig. 

More interesting is the behaviour in the realistic case 
when 5h{X) is not distributed exactly as the GPR has 
predicted. This case is more complicated because the 
different components that make up the R^ random vari¬ 
able are no longer independent random variables and a 
simple expression for the distri but ion of dh{X) cannot be 
found. In particular, from Eq. (42) it can be seen that R^ 


is the sum of the squares of a noise term, < n\dxH >, a 
GPR term, < ^{Xo)\dxH >, and (minus) a physical term, 


I 

< Sh{Xo)\dxH >. In particular the GPR and physical 
terms are now related because the expression for ^(Xq) 
in Eq. ([^ is a linear combination of the realisations of 
6h{X) in the training set, V. The sag will still be given by 
the analogue of Eq. (28), but this integral will not in gen¬ 


eral be analytically tractable. Instead, we will consider 


such cases numerically in Sec. IV 


As we have seen, in the particular case considered 
above where the waveform difference is distributed as 
assumed by the GPR, the marginalised likelihood com¬ 
pletely removes the systematic bias present in the stan¬ 
dard, approximate, likelihood. In addition, as we will 
see in Sec. EYl even in unfavourable situations the 
marginalised likelihood is often able to remove significant 
portions of the bias. We conclude this section with a dis¬ 
cussion of why it is expected that the bias in parameter 
estimates obtained using the marginalised likelihood will 
usually be less than those obtained using the standard 
likelihood. 


From Eqs. (30) and (42) it can be seen that the condi¬ 
tion for the marginalised ikelihood to yield more biased 
parameter estimates than the approximate likelihood, for 
a particular event, is i?approx < -^gpr/(1+^^('^o))i where 


Rt 


= ^ /n - 5h{XQ)\dxH 


-^GPR ~ ~ ^^(^o) + p{Xo)\dxH^ . (46) 


















These terms both involve a projection onto the space 
spanned by the derivatives, dxH, at the point Aq. Since 
these “tilde” derivatives were constructed to be an or¬ 
thonormal basis, the condition for the marginalised like¬ 
lihood to give worse parameter estimates than the ap¬ 
proximate likelihood can therefore be written as 


n — Sh^Xo) 


< 


n - Sh{Xo) + ^J,{Xo) 


V 


D 


1 -I- (t2(Ao) 


(47) 


where the modulus is taken with respect to the func¬ 
tion inner product in Eq. ([^, projected into the space, 
V, spanned by the derivatives. For this to be satisfied, 
it would be necessary not only for the interpolation to 
have the wrong sign when expressed in the basis dxH 
(i.e. 0 > ( K^o)\dxH ) ( n{\)\dxH )), but also for it 

to be large enough in magnitude to overcome the GPR 
uncertainty ct^(Ao) in the denominator. Moreover, this 
is just for one particular realisation of the noise and true 
waveform parameters. We are really interested in the sag 
that arises when considering a population of events. In 
that case, we would need Eq. ( [47| ) to be true in some aver¬ 
age sense and so the interpolation would have to have the 
wrong sign and be too large for the majority of choices of 
waveform parameters. Although this is technically possi¬ 
ble, it is clear that any reasonable interpolation algorithm 
with decent coverage of the parameter space in the train¬ 
ing set and a reasonable covariance function should vio¬ 
late the above bound on average and therefore yield bet¬ 
ter parameter estimates on average and show a smaller 
sag in the P-P plot than the approximate likelihood. 


IV. NUMERICAL CALCULATION OF THE P-P 
PLOT 

In all of the above calculations the expression for the 
P-P plot was written in terms of the CDF of the distri¬ 
bution of the random variable. This random variable 
is written in terms of a signal inner product of the model 
derivatives, it therefore depends both on the properties 
of the GW source and of the GW detector. By express¬ 
ing our results in terms of we ensure that they remain 
valid for any detector and any source (assuming the LSA 
holds). In the cases considered above where analytic ex¬ 
pression for the P-P plots could be found these can also 
be verified numerically by drawing n values of R^ from 
the relevant distribution and numerically estimating the 
CDF. In cases where an analytic expression for the P-P 
plot can not be found the same procedure can be used to 
investigate the P-P plot numerically. 

First consider the unbiased, diagonal P-P plot obtained 
for the exact likelihood. The analytic expression for this 
P-P plot is given in Eq. 0- A numerical validation of 
this result may be performed by drawing random realisa¬ 
tions of the R^ value in Eq. ( [l^ . It can be seen that R^ 
is the sum of the squares of N standard Gaussian random 


variables < n\dxH >; i.e. a distribution with N de¬ 
grees of freedom. We drew n realisations of R^ from this 
distribution, numericall y es timated the CDF and plotted 
the P-P plot using Eq. (16 1 . The results for n = 10^ and 
V = 4 are shown in the left panels of Fig. (analytic re¬ 
sults shown as a dotted line, numerical results as a solid 
line). Within the scale of fluctuations the numerical re¬ 
sults agree well with the analytic results. The bottom 
left panel of the same figure shows the sag of the P-P 
plot below the diagonal, i.e. sig—P(a; < sig). The values 
n = 10^ and V = 4 will also be used for all subsequent 
numerical calculations in this section. 

P-P plots for the approximate likelihood are shown in 
the centre panels of Fig. for a variety of different dis¬ 
tributions of the waveform difference projected into the 
model derivatives; < Sh{Xo)\dxH >. In the case of a con¬ 
stant distribution, or a zero-mean Gaussian distribution 
the analytic expressions in Eqs. (33) and (351 respec¬ 
tively are shown as dotted lines. For the numerical cal¬ 
culations the procedure followed was first to specify the 
distribution for the < 6h(XQ)\dxH > random variables 
(for example the black curves show results when this is 
a constant). The quantity R^ was then calculated using 
Eq. (301 by drawing a random value from this distribu¬ 
tion and a random value for < n\dxH > from a standard 
Gaussian distribution. The R^ variable was calculated n 
times, the GDF of this variable estimated, and the P-P 
plot calculated from Eq. (32). Different colours in Fig. 
indicate different distributions for < 6h{Xo)\dxH >, the 
specification of these distributions are given in the figure 
caption. 



Approximate Marginalised 

Constant 

0.158 

-0.044 

Gaussian 

0.237 

0.000 

non-central Gaussian 

0.385 

0.263 

Skew non-central Gaussian 

0.426 

0.079 

Poisson 

0.317 

0.235 

Gamma 

0.293 

-0.001 

Correlated 

0.441 

0.308 


TABLE I. Table of the total integrated biases for the curves 
shown in Fig. The integrated bias is defined as the total 
area in the sag, i.e. d(sig) (sig — P{x < sig)). 


P-P plots for the marginalised likelihood are shown 
in the right panels of Fig. Jl] for a variety of different 
distributions of < dh{Xo)\dxH >. In the case of a 
zero-mean Gaussian the analytic expressions in Eq. (45) 
is shown as a dotted line. For the numerical calcula¬ 
tions it is necessary to construct a training set for the 
GPR interpolation. Instead of using GPR to interpo¬ 
late the waveform differences, Sh{X), it is simpler for our 
present purpose to instead interpolate the projections 
of the waveform differences onto the waveform deriva¬ 
tives, i.e., < Sh{X)\dxH >, as these are what appear 
in Eq. (42), . The training set was taken to consist 
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Exact Likelihood Approximate Likelihood 




Marginalised Likelihood 



sig 


Constant 

Gaussian 

^*Non- central 
Gaussian 

^^■Skewed 

non-central 

Gaussian 

IBI Poisson 
Gamma 

■m Correlated 


FIG. 1. P-P plots for parameter estimation using the three likelihoods L'{X), L{X), and -C(A) shown in the three columns 
respectively. In each column the top panel shows a P-P plot whilst the bottom panel shows the “sag”; i.e. the difference 
between the ideal diagonal line and the actual P-P plot. In each panel curves drawn as dotted lines correspond to analytic 
results whilst solid curves are numerical results. The left-hand column shows ideal, unbiased parameter recovery for the exact 
likelihood. In the centre and right-hand columns different colour curves correspond to different distributions of < 5h{Xo)\dxH >. 
The curves in black are for a constant distribution giving a non-centrality A = 2 (A defined in Eq. @). The curves in orange 
are for a zero mean Gaussian distribution with variance 1. The curves in yellow are for a non-central Gaussian distribution 
with mean 4/3 and variance 1. The curves in light-green are for a non-central, skewed normal distributiorQ with location 
parameter 1, scale parameter 1, and skew parameter 1. The curves in dark-green are for a Poisson distribution with mean 
and variance 1. The curves in blue are for a Gamma distribution with shape parameter 1 and scale parameter 1. And finally, 
the curves in purple are for a correlated random walk distribution with mean Gaussian step size 1. In all cases the number 
of parameter dimensions is = 4, and the number of points used for the numerical simulations was n — 10^. The left-hand 
panel clearly shows the exact likelihood does not suffer from any bias, as expected. The centre panel shows that in all cases the 
approximate likelihood suffers from a bias. The right-hand column shows that in all cases the marginalised likelihood reduces 
the bias relative to the approximate likelihood. In the ideal case (shown in orange) of a zero mean Gaussian distribution for 
< 5h{Xo)\dxH > the bias is completely removed. 

^ The PDF of a skew Gaussian distribution with location parameter fi, scale parameter a and skew parameter a is given by 
1^1 + erf(Q:(x — /r)/v/2o')j exp(—(x — /r)^/2a'^) 


of points at A = 1, 2,..., 20 and the actual experimen¬ 
tal realisation at a value Aq = 21. For the majority of 
distributions (constant, Gaussian, non-central Gaussian, 
skew non-central Gaussian, Poisson, and Gamma distri¬ 
butions) shown in Fig. the random variables in the 
training set were drawn independently and interpolated 
using an uncorrelated Gaussian process, i.e. K^j — UfSij. 

(with — 0 


42 


The value was calculated from Eq. 
from Eq. ([^, because of the assumption of an uncor¬ 
related process), the GDF estimated and the P-P plot 
calculated from Eq. (|44 1. 


The assumption of an uncorrelated Gaussian process is 
a conservative assumption. In the absence of correlations 
the Gaussian process regression assumes a “worst-case” 
scenario and returns a mean waveform difference of zero 
(see Eq. (|^). If correlations were present then the GPR 


would return a non-zero estimate for /r and shift the posi¬ 
tion of the posterior peak into better agreement with the 
true value, thus improving the P-P plot. To investigate 
the effect of correlations the hnal numerical calculation 
(labelled as “correlated” in Fig.[^ was performed using a 
random walk distribution. The values of < Sh{X)\dxH > 
at the points A = 1, 2,..., 21 were taken to be a realisa¬ 
tion of a random walk with Gaussian step width a = 1/3. 
The first 20 of these values were taken as the training set 
and used to extrapolate the hnal value. For the GPR 
interpolation a squared exponential covariance function 
k{x,y) = exp((—l/2)(a; — y)^) was used. The squared 
exponential covariance function is not able to accurately 
capture the covariance of the random walk distribution, 
so this again represents a conservative choice to examine 
how the marginalised likelihood performs in the presence 
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of un-modelled correlations. However, even in this un¬ 
favourable case the marginalised likelihood still signifi¬ 
cantly reduces the bias in the P-P plot. 

The purpose of considering such a wide variety of dif¬ 
ferent distributions for the waveform difference is to test 
whether the marginalised likelihood is robust against dif¬ 
ferent types of errors in the waveform models, which are 
not correctly modelled by the Gaussian process. For 
example, the marginalised likelihood assumes the wave¬ 
form difference is a zero mean Gaussian process across 
parameter space, therefore it is perhaps not surprising 
that it performs well in the case of a zero mean Gaus¬ 
sian distribution. However the list of distributions used 
here also test the robustness of the method against non¬ 
central distributions (e.g non-central Gaussian), skewed 
distributions (e.g. skewed Gaussian), one-sided and non- 
Gaussian distributions (e.g. Poisson or Gamma distribu¬ 
tions), and the presence of un-modelled correlations in 
the waveform difference (the random walk distribution). 

By comparing the curves of the same colour between 
the centre and right-hand panels of Fig. it can be seen 
that in all cases the P-P plot for the marginalised like¬ 
lihood exhibits less of a bias, i.e., less of a “sag”, than 
the approximate likelihood. In the ideal case where the 
distribution of the waveform differences is precisely that 
assumed by the GPR, a diagonal, unbiased P-P plot is 
recovered; however the bias is also almost completely re¬ 
moved for several of the other distributions considered. 
In all cases a significant improvement in performance can 
be seen when using the marginalised likelihood in place 
of the standard approximate likelihood. These results 
are summarised in Table |lj which lists the total bias (de¬ 
fined as the area between the sagging curve and the ideal 
diagonal) for all the curves shown in Fig. 


V. DISCUSSION 

The P-P plot provides a way to quantify the bias that 
results when using inaccurate models to perform GW pa¬ 


rameter estimation. For individual sources the system¬ 
atic error in the parameters is independent of the SNR, 
whilst the random errors scale as 1/SNR, and hence the 
bias is most significant for the loudest sources. Even in 
cases where, for each individual source, the systematic er¬ 
ror is small compared to the random error, the bias can 
still be significant when observing populations of sources, 
since the statistical error in a parameter estimated from 
combining a population of sources reduces as I/Vn as 
more sources are added, while the systematic errors re¬ 
main fixed. 

In this paper several analytic expressions have been ob¬ 
tained that predict the sag of the P-P plots that results 
from different distributions of the model error. These re¬ 
sults have been derived within the linear signal approxi¬ 
mation, and are valid to 0( 1/SNR). These analytic ex¬ 
pressions for the P-P plots may be viewed in the same 
spirit as Fisher matrix estimates for the random errors, 
or Gutler and Vallisneri’s [5] expression for the system¬ 
atic error in a single measurement. This latter result has 
also here been generalised (in Appendix 0 to include 
terms of 0(1/SNR^). 

It is now well established that model errors will present 
significant problems for a range of GW sources. The au¬ 
thors recently proposed a novel method for tackling this 
problem; using a modified likelihood constructed using 
Gaussian process regression on a training set of accu¬ 
rate waveforms. In this paper the performance of this 
marginalised likelihood was examined by comparing the 
P-P plots (obtained both analytically and numerically) 
with those obtained from the standard likelihood. In 
particular, it was found that in favourable conditions the 
marginalised likelihood was able to completely remove 
the parameter estimation bias. More importantly, it was 
found that the marginalised likelihood was robust against 
a wide range of un-modelled features in the distribution 
of waveform differences, and in all cases considered out¬ 
performed the standard likelihood. These results provide 
further illustration of the need to account for model un¬ 
certainties (using GPR or other techniques) when draw¬ 
ing inferences from near future GW observations. 
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Appendix A: Systematic bias due to waveform errors 


We assume that an approximate model H{X) is used 
to recover the parameters of a gravitational wave signal 
that is described by the true model h{X) with parameters 


Aq. The best fit parameters of the approximate model are 
Abf = Aq + AA. These parameters minimise the squared 
distance between the true and approximate model spaces, 


(^6h{Xo) + iL(Abf) - H{Xo)\Sh{Xo) + i?(Abf) - i?(Ao)) 

where Sh{Xo) — H{Xo) — h{Xo) (note the different sign 
convention from M)- Differentiating with respect to each 
of the parameters in turn, we find that the best-fit pa¬ 
rameters must satisfy the equations 


{6hiXo) + i/(Abf) - H{Xo)\da (i/(Abf) - i^(Ao))) = 0 . 

(A2) 

We use the notation daX = dx/dX‘^ and subsequently will 
use dabX = d^x/dX^'dX^. If we now assume that the ap¬ 
proximation is good, we can write Sh{X) ^ 0{e), a small 
parameter, and Abf = Aq + AA with AA* ^ 0(e) Vi. We 
can then expand the difference between the approximate 
waveforms as a Taylor series 


iL(Abf) - H{Xo) = daHiXo)AX- + 

^dabH{Xo)AX‘^AX^ + ■■■ . (A3) 


Eq. (A2| 


becomes 


6hiXo) + dbH(Xo)AX^ + -dbcHiXo)AX‘’AX‘^\daH{Xo) + dadH{Xo)AX^ ) = 0 . 


(A4) 


where all derivatives are now evaluated at Aq. Keeping 
only terms of order e we find the Cutler and Vallisneri 


result 


AA? = - (£-1)“" {5h{Xo)\dbH{Xo)) (A5) 


where = {daH(Xo)\dbH(X q)) is the Fisher Matrix. 

We now extend to the next order in e by writing A A° = 
AA?-|-AA 2 , where AA? is the previous solution, Eq. (A51. 
Keeping terms to 0{e'^) we obtain 


AA? = - (E-A 


ab 


{6h(Xo)\dabH iXo))AX\ + {dacHiXo)\dbH{Xo))AXlAX‘i + -(5aiV(Ao)|aabiL(Ao))AA?AA? 


• (A6) 


r 


A suitabl e va lidity criterion for the Cutler and Vallisneri 
formula, (A5), is 


maxa {|AA?/AA?|} < 1. 


(A7) 


mate of the systematic bias and that we can readily ex¬ 
tend this method to higher order in e by including further 
terms in the expansion in Eq. (A3). 


We note also that Eq. (A6) provides an improved esti- 













